Yongxin Wang

Accordion-Thinking: Self-Regulated Step Summaries for Efficient and Readable LLM Reasoning

Feb 03, 2026 (via arXiv)

Bridging the Knowledge-Action Gap by Evaluating LLMs in Dynamic Dental Clinical Scenarios

Jan 19, 2026 (via arXiv)

CARE What Fails: Contrastive Anchored-REflection for Verifiable Multimodal

Dec 22, 2025 (via arXiv)

PhyBlock: A Progressive Benchmark for Physical Understanding and Planning via 3D Block Assembly

Jun 10, 2025 (via arXiv)

EACO: Enhancing Alignment in Multimodal LLMs via Critical Observation

Dec 06, 2024 (via arXiv)

Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs

Jun 28, 2024 (via arXiv)

Prototype-Based Layered Federated Cross-Modal Hashing

Oct 27, 2022 (via arXiv)

Three-Stream Joint Network for Zero-Shot Sketch-Based Image Retrieval

Apr 12, 2022 (via arXiv)

ViT-FOD: A Vision Transformer based Fine-grained Object Discriminator

Mar 24, 2022 (via arXiv)

Learning Hierarchical Graph Neural Networks for Image Clustering

Jul 17, 2021 (via arXiv)